Calculate polarity at axis #12
Polarity = 1 - 0/5 = 1.0

Calculate polarity at axis #13
Polarity = 1 - 3/5 = 0.4
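The two worked values above can be reproduced with a short Python sketch. It assumes that the numerator is the number of symmetric (complementary) base pairs found around the axis and that the denominator is the window size; the function name, the 0-based `axis` index and the example sequences are hypothetical and are not taken from the figure.

```python
# Watson-Crick complements (assumes an upper-case G/A/T/C alphabet).
COMPLEMENT = {"G": "C", "A": "T", "T": "A", "C": "G"}


def polarity_at_axis(seq, axis, window):
    """Polarity for an axis lying between seq[axis - 1] and seq[axis].

    Counts the positions j (0 .. window-1) where the base j steps to the
    left of the axis is complementary to the base j steps to the right,
    then returns 1 - matches / window.
    """
    if window < 1 or axis - window < 0 or axis + window > len(seq):
        raise ValueError("window does not fit around this axis")
    matches = 0
    for j in range(window):
        left = seq[axis - 1 - j]
        right = seq[axis + j]
        if COMPLEMENT[left] == right:
            matches += 1
    return 1 - matches / window


# No symmetric pairs in a window of 5: polarity = 1 - 0/5
print(polarity_at_axis("AAAAAAAAAA", 5, 5))  # 1.0
# Three symmetric pairs in a window of 5: polarity = 1 - 3/5
print(polarity_at_axis("ACGTATACAA", 5, 5))  # 0.4
```

A perfect palindrome such as GAATTC gives a polarity of 0.0 under this definition, since every pair around the central axis is complementary.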
A. HIV2 complete genome

B. Random sequence

Extended regions of increased polarity. The peaks represent regions of 500-600 nucleotides in which polarity values that deviate from the expected 0.75 are concentrated.
1. Enter input sequence of desired length
2. Enter window size
3. Axis chosen (a)
4. Calculate and output polarity value (b)
5. Move to next axis
6. Calculate and output polarity value around new axis, then... (c)
7. Output ordered list of polarity values
8. Graph these values (d)
9. Statistical analysis of observed vs. predicted
10. Identify regions of extended polarity

a Starting at position = (2*window of symmetry)
b [1-(S/W)]
c Up to and including axis position = [2*length - (2*window size)]
d Can use a moving average of values (with the number of values averaged and the increment of movement both variable) to smooth the curve
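The smoothing option in the last note can be sketched in Python as follows. The function and parameter names are hypothetical; the only thing taken from the flowchart is that both the number of values averaged and the increment of movement are adjustable.

```python
def moving_average(values, n_values, increment=1):
    """Smooth a list of polarity values.

    n_values  -- how many consecutive values are averaged at a time
    increment -- how far the averaging window advances between outputs
    """
    if n_values < 1 or increment < 1:
        raise ValueError("n_values and increment must be positive")
    smoothed = []
    # Only full windows are averaged; a trailing partial window is dropped.
    for start in range(0, len(values) - n_values + 1, increment):
        chunk = values[start:start + n_values]
        smoothed.append(sum(chunk) / n_values)
    return smoothed


polarity_values = [1.0, 0.5, 0.0, 0.5, 1.0]
print(moving_average(polarity_values, 3))               # three overlapping averages
print(moving_average(polarity_values, 3, increment=2))  # every second window only
```

A larger `n_values` gives a smoother curve at the cost of resolution, and a larger `increment` reduces the number of plotted points.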
The algorithm was implemented in the PERL programming language.
PERL variable names and function names are in boldface.



import input sequence as a string variable ($input_seq)

prompt user for ($win_sym) length

cut a length of (2*$win_sym) bases from the 5' end of $input_seq and assign this substring into $win_seq

perform (2*$win_sym) iterations:
    chop (cut last character off a string and return it) $win_seq
    unshift chopped character (prepend it to the front of an array, moving all elements one step to the right) into an initially empty indexed array (@trgt_fwd)

perform (2*$win_sym) iterations:
    translate $win_seq so that every G or A or T or C of the original string becomes C or T or A or G respectively in the translated string
    chop the translated $win_seq
    push chopped character (stack it onto the end of an array, without moving the other elements) onto an initially empty indexed array (@trgt_revcomp)

assign $match_count = 0

assign index variable $i = 1

perform ($win_sym) iterations:
    compare the $i th elements of arrays @trgt_fwd and @trgt_revcomp; if equal then increase $match_count by 1, else nothing
    increase $i by 1

calculate $asym_count = 1 - ($match_count / $win_sym)

push $asym_count onto initially empty indexed array @axis_list



while $input_seq is not empty:

    cut one base off of the 5' end of $input_seq and assign it into $basefeed

    translate $basefeed GATC->CTAG respectively and assign the translated character into $basefeed_comp

    unshift $basefeed_comp (prepend it to the front of an array, moving all elements one step to the right) into array @trgt_revcomp

    pop (remove last element of an array) @trgt_revcomp

    assign $match_count = 0

    assign index variable $i = 1

    perform ($win_sym) iterations:
        compare the $i th elements of arrays @trgt_fwd and @trgt_revcomp; if equal then increase $match_count by 1, else nothing
        increase $i by 1

    calculate $asym_count = 1 - ($match_count / $win_sym)

    push $asym_count onto @axis_list

    shift (remove first element of an array and move all elements one step leftward) @trgt_fwd

    push $basefeed onto array @trgt_fwd

    assign $match_count = 0

    assign index variable $i = 1

    perform ($win_sym) iterations:
        compare the $i th elements of arrays @trgt_fwd and @trgt_revcomp; if equal then increase $match_count by 1, else nothing
        increase $i by 1

    calculate $asym_count = 1 - ($match_count / $win_sym)

    push $asym_count onto @axis_list

save @axis_list to file
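The flowchart above can be approximated by the following Python sketch. This is a port made under stated assumptions, not the original PERL program: variable names mirror the flowchart, `deque` stands in for the PERL arrays, the index `$i` is treated as a 1-based position (so positions 1 to `win_sym` become Python indices 0 to `win_sym - 1`), and the input is assumed to contain only upper- or lower-case G, A, T and C. The function name `polarity_scan` is hypothetical.

```python
from collections import deque

# G->C, A->T, T->A, C->G, as in the flowchart's translate step.
COMPLEMENT = str.maketrans("GATC", "CTAG")


def polarity_scan(input_seq, win_sym):
    """Return the ordered list of polarity values (the flowchart's @axis_list)."""
    seq = input_seq.upper()
    if win_sym < 1 or len(seq) < 2 * win_sym:
        raise ValueError("sequence must hold at least 2 * win_sym bases")

    # First 2*win_sym bases form the initial window; the rest is fed in later.
    win_seq, rest = seq[:2 * win_sym], seq[2 * win_sym:]
    trgt_fwd = deque(win_seq)                                    # forward strand
    trgt_revcomp = deque(win_seq.translate(COMPLEMENT)[::-1])    # reverse complement

    def asym_count():
        # Compare the first win_sym positions of both arrays.
        matches = sum(1 for i in range(win_sym)
                      if trgt_fwd[i] == trgt_revcomp[i])
        return 1 - matches / win_sym

    axis_list = [asym_count()]
    for basefeed in rest:
        # Half step: update only the reverse-complement array.
        trgt_revcomp.appendleft(basefeed.translate(COMPLEMENT))  # unshift
        trgt_revcomp.pop()                                       # pop
        axis_list.append(asym_count())
        # Full step: update the forward array as well.
        trgt_fwd.popleft()                                       # shift
        trgt_fwd.append(basefeed)                                # push
        axis_list.append(asym_count())
    return axis_list


print(polarity_scan("GAATTCA", 3))
```

Each base fed into the window produces two values because the two arrays are updated one after the other, which appears to be why axis positions in the flowchart notes run up to 2*length rather than length.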
101 Enter input sequence of desired length
102 Enter window size
103 Axis chosen (a)
104 Count DPT frequencies, calculate and output DPT residuals and chi square value
105 and 105* Move to next axis
106 and 106* Count DPT frequencies, calculate and output DPT residuals and chi square value around new axis, then... (b)
107 Save to file: ordered arrays of DPT frequencies, DPT residuals and chi square values
108 Further statistical analysis of observed vs. predicted
109 Graph values (c)
110 Identify functional elements

a Starting at axis position = (2*window size)
b Up to and including axis position = [2*length - (2*window size)]
c Values include DPT frequencies, statistical measures including residuals and chi square
The algorithm was implemented in the PERL programming language.
PERL variable names and function names are in boldface.

import input sequence as a string variable ($input_seq)

prompt user for ($win_sym) length

cut a length of (2*$win_sym) bases from the 5' end of $input_seq and assign this substring into $win_seq

perform (2*$win_sym) iterations:
    chop (cut last character off a string and return it) $win_seq
    unshift chopped character (prepend it to the front of an array, moving all elements one step to the right) into an initially empty indexed array (@trgt_fwd)

perform (2*$win_sym) iterations:
    translate $win_seq so that every G or A or T or C of the original string becomes C or T or A or G respectively in the translated string
    chop the translated $win_seq
    push chopped character (stack it onto the end of an array, without moving the other elements) onto an initially empty indexed array (@trgt_revcomp)

assign $gg_count = 0, $ga_count = 0, ... one such variable for each of the 16 DPTs

assign index variable $i = 1

perform ($win_sym) iterations:
    if the $i th elements of arrays @trgt_fwd and @trgt_revcomp are G and G respectively, then increase $gg_count by 1. Repeat this conditional operation for each of the 16 possible DPTs.
    increase $i by 1

calculate residual: $gg_res = $gg_count - ((1/16) x $win_sym). Repeat this calculation for each of the 16 possible DPTs.

push $gg_res onto @gg_res. Repeat this step for each of the 16 possible DPTs.
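The counting and residual steps can be sketched in Python as below, together with the chi square sum that the flowchart computes from the same residuals inside its sliding loop. This is an illustrative sketch under assumptions: a DPT is treated here simply as the ordered pair formed by the i-th elements of the forward and reverse-complement arrays, the expected count for every DPT is taken as (1/16) x window size as read from the flowchart, positions are 0-based, and all function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("GATC", "CTAG")
# The 16 ordered pairs GG, GA, GT, GC, AG, ... CC.
DPTS = ["".join(pair) for pair in product("GATC", repeat=2)]


def dpt_counts(window_seq, win_sym):
    """Count each pair (forward[i], revcomp[i]) over the first win_sym positions.

    window_seq is the current 2*win_sym window, upper-case G/A/T/C only.
    """
    revcomp = window_seq.translate(COMPLEMENT)[::-1]
    counts = dict.fromkeys(DPTS, 0)
    for i in range(win_sym):
        counts[window_seq[i] + revcomp[i]] += 1
    return counts


def dpt_residuals(counts, win_sym):
    """Observed count minus the expected count (1/16) x win_sym."""
    expected = win_sym / 16
    return {dpt: count - expected for dpt, count in counts.items()}


def chi_square(residuals, win_sym):
    """Sum of residual**2 / expected over all 16 DPTs."""
    expected = win_sym / 16
    return sum(res ** 2 / expected for res in residuals.values())


counts = dpt_counts("GAATTC", 3)
residuals = dpt_residuals(counts, 3)
print(counts["GG"], counts["AA"])
print(chi_square(residuals, 3))
```

Keeping the counts in one dictionary keyed by pair replaces the sixteen separate `$gg_count`, `$ga_count`, ... variables of the flowchart without changing the arithmetic.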
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-» ^tiie^inpwt^aeqbnot empt^ 

^onetwoe' ott of the yend of iopw<_»eqantf assignftaTto >bajele<d| 



iwslal^ fbaj«lee4GAIC->CTAGr«pectrvc^ahda33igntht 
rarislatedtharacterWo'tba3e<ced_co««*» 



imskift Jbase«««<Jco«p(prependiltothefront of anarray, rnovtigaa;e)en>ert3 ; 6ne 
>tep tot he nqhtj into array gtr.gt jr eye xwp 



»op (remove fast denient of an (gray) l»trg*_rc>>co«npi 



oaityiigg^c oBnt-* I, tga^cowot- Q„. one such variable lor.each of the IS OPisi| 



asign Index variable ti g- 1| , 



^Bthe Si th dements of arrays *?trgt_lvd find ^trat^revtomp areGwd^; reap t^eh^ hen 
petfomi J* Tncrease >gg_co<mt by t.Repe^tris congltional op^raUonfor each & the M possibteDPT^ 



»fculalereiid«3b:igg\.res: - *gg^c«unt -(( 111*1 x ivitt^syml f Kepeal 
hfa^UatTanfor eadh of the 16 possibte D?T», 



Rah IgB-^f onto tgtg'fl^ea: Repeat tHs. step for each of the ,16 posaiSteOPTs 



Jcfct sq - Uibo_res fj V*******?^™.* 

(t gt ~res —21/ (( l# I<1 x $yii%t_sVW»;.r 
(f ffcTre* "21 1 (1 « IS, X >>i5n^»yni| * 
(tag_re» "21 1 H tM 61 x f vtt* j*>*il ♦ 
ltaa_res •♦2Mf( UUJxtSw^yipl * 
ftat re*"21IUUUlxtvw»^iy«l>^«' 
J|ac_rej "2H (( Ut«lx : tyinf^»yrtl ♦ 
I$tg_res -21 / H «.f«l X f virt^syrolr 
ItU res-21l(nHSlxt>rtrt^»y*nI* 
(ftOes "2} I f( U 1 «l x t\*n~?Tfi*i + 
(ttc res "217 U1M6VX tvirt_synil ♦ 
If colres "2) I (( U *v^^»y«| * 
($c*_res "2) I (( If Uj x $v»n_»yml * 
Jtcl_res "21 1 (( » • *lJt t vin^syml 
(tec_res "21 1 jj If 1<1 x tvwt_>yrBU, 



hjmsh tchi_sq onto tjcbi_s^ | 



hilt (remove frrt element of an array and move afl dements one step 
Icflvo/d) gUgM vd 



^as.fc >t>ascf eed onto array ^trgt_f vdj 

^s«gntgg_coBnt= ■, tga_co«nt ° |... one such variable tor each of the tSDPls.| 



fmignindexvariableli » H 
^.Itthe >i th dements of arrays Ptrat fvd m<S *>trot_revc»«np 
perform p lncreasetg g cgwrtby l.Repe -thiscondaional Qperalionfor e 
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